A critic-critic architecture to combine reinforcement and supervised learnings

نویسندگان

  • Fabien Montagne
  • Samuel Delepoulle
  • Philippe Preux
چکیده

In real life, learning is greatly speeded-up by the intervention of a teacher who gives examples, or shows, how to perform a certain task. In all this abstract, we let apart structural simpli cations of the problem by the designer which to not deal explicitely with learning. The intervention of the teacher can be realized in di erent ways: verbal explanation, demonstration, guidance, shaping the behavior, ... see e.g. (Jordan, 1986; Gaussier et al., 1997; Hugues & Drogoul, 2001; Rosenstein & Barto, 2002).

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Learning to Learn: Meta-Critic Networks for Sample Efficient Learning

We propose a novel and flexible approach to meta-learning for learning-to-learn from only a few examples. Our framework is motivated by actor-critic reinforcement learning, but can be applied to both reinforcement and supervised learning. The key idea is to learn a meta-critic: an action-value function neural network that learns to criticise any actor trying to solve any specified task. For sup...

متن کامل

1 Supervised Actor - Critic Reinforcement Learning

Editor’s Summary: Chapter ?? introduced policy gradients as a way to improve on stochastic search of the policy space when learning. This chapter presents supervised actor-critic reinforcement learning as another method for improving the effectiveness of learning. With this approach, a supervisor adds structure to a learning problem and supervised learning makes that structure part of an actor-...

متن کامل

G Uide a Ctor - C Ritic for C Ontinuous C Ontrol

Actor-critic methods solve reinforcement learning problems by updating a parameterized policy known as an actor in a direction that increases an estimate of the expected return known as a critic. However, existing actor-critic methods only use values or gradients of the critic to update the policy parameter. In this paper, we propose a novel actor-critic method called the guide actor-critic (GA...

متن کامل

Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations

Pretraining with expert demonstrations have been found useful in speeding up the training process of deep reinforcement learning algorithms since less online simulation data is required. Some people use supervised learning to speed up the process of feature learning, others pretrain the policies by imitating expert demonstrations. However, these methods are unstable and not suitable for actor-c...

متن کامل

Fast Learning in an Actor-Critic Architecture with Reward and Punishment

A reinforcement architecture is introduced that consists of three complementary learning systems with different generalization abilities. The ACTOR learns state-action associations, the CRITIC learns a goal-gradient, and the PUNISH system learns what actions to avoid. The architecture is compared to the standard actor-crititc and Q-learning models on a number of maze learning tasks. The novel a...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2003